perm filename READA[B2,JMC]1 blob sn#762063 filedate 1984-07-17 generic text, type C, neo UTF8
COMMENT ⊗   VALID 00009 PAGES
C REC  PAGE   DESCRIPTION
C00001 00001
C00002 00002	\chapter{INTRODUCTION TO LISP}
C00005 00003	\section{Lists.}
C00011 00004	\section{Atoms.}
C00014 00005	\section{List structures.}
C00022 00006	\section{S-expressions.}
C00029 00007	\subsubsection*{Exercises}
C00042 00008	
C00043 00009	
C00044 ENDMK
C⊗;
\chapter{INTRODUCTION TO LISP}
\label{READIN}


	\lisp\ is a language for writing programs that compute with symbolic 
expressions.  Internally symbolic information is represented by S-expressions.
A \lisp\ program can also be represented as an S-expression.  
This gives \lisp\ the special ability to easily manipulate
programs as well as other sorts of symbolic data.  
In this chapter we describe notation for lists and
S-expressions, show how they are represented in a computer as list structures,
and describe the basic functions and predicates on the domain of S-expressions
in terms of the computer representation.
We then describe how the 
basic constructs of \lisp\ are combined to form \lisp\ programs.

	In the first four sections we will describe lists, atoms
and S-expressions in terms of their representation as strings of
characters and as list structures in the memory of a computer.
\secref{basicfns} describes the basic functions and predicates.
In \secref{asexp} we will introduce a more abstract point of view in
which S-expressions are regarded as built up from an arbitrary
set of atoms by an abstract pairing operation.
From this point of view, the main properties of \lisp\ data and
the functions used to compute with them are independent of any
concrete representation, whether by strings or by list structures.
\section{Lists.}
\label{lists}

	We begin with some examples.  The most common form of S-expression 
is the list. \figref{f1} shows some examples of lists.
The list (i) has four elements, one of which is repeated.
The list (ii) has four elements one of which is itself a list.
The list (iii) has one element.
The list (iv) also has one element which itself is a list.
The list (v) has no elements; it is also written \qNIL.

\begin{figure}
\label{f1}

$$\vbox{%
\ialign{\hfil#&\quad\sx #\hfil\cr
(i) & (A B A E)\cr
(ii) & (A B (C D) E)\cr
(iii) & (A)\cr
(iv) & ((A B C D))\cr
(v) & ()\cr
}}$$

\caption{Examples of lists.}
\end{figure}


	\figref{f2} shows how symbolic information can be represented by lists.  
Each example consists of a list and a ``mathematical'' representation of 
the same information.  Examples (i)-(iv) represent familiar mathematical 
and logical expressions.  The list (v)
is used to represent the network or graph shown according to a scheme
whereby there is a sublist for each vertex consisting of the vertex
itself followed by the vertices to which it is connected.

	The elements of a list are surrounded by parentheses and
separated by spaces.  A list may have any number of terms and any of
these terms may themselves be lists.  In this case, the spaces
surrounding a sublist may be omitted, and extra spaces between
elements of a list are allowed.  Thus
{\sx (A B(C\ \ D)\ \ \ \ E)} and {\sx (A B\ (C\ D)\ E)}
are slightly different ways of writing the same list.

\begin{figure}
\label{f2}

\ialign{\qquad\hfil#&\qquad#\hfil\cr
(i) & {\sx (+ X Y)}\cr
    & $x + y$\cr
\noalign{\smallskip}
%
(ii) & {\sx (+ (* X Y) X 3)}\cr
     & $xy + x + 3$ \cr
\noalign{\smallskip}
%
(iii) & {\sx (EXIST X (ALL Y (IMPLIES (P X) (P Y))))}\cr
      & $∃x.∀y.[P(x)\limp P(y)]$ \cr
\noalign{\smallskip}
%
(iv) & {\sx (INTEGRAL 0 INFINITY (* (EXP (* I X Y)) (F X)) X)}\cr
     & $\int↑{\infty}_0 e↑{ixy}f(x)dx$\cr
\noalign{\smallskip}
%
(v) & {\sx ((A B) (B A C D) (C B D E) (D B C E) (E C D F) (F E))}\cr
    &  ??? \cr
\noalign{\smallskip}
}

%  .BEGIN "graph" SELECT D; 
%  .PREFACE 0
%  .MILLPREFACE ← 0
%  .SKIP 
%                                                                                   
%                                                                                   
%                                                                                   
%                                     ∂(40)C                                        
%                                   ∂(38)≤'~`≥                                       
%                                 ∂(36)≤'  ~  `≥                                     
%                             ∂(32)B ≤'    ~    `≥ E                                 
%                    ∂(23)A αααααααα∂(33)'∂(33)≥∂(40)~∂(47)≤∂(47)`αααααααα F          
%                               ∂(34)`≥    ~    ≤'                                  
%                                 ∂(36)`≥  ~  ≤'                                   
%                                   ∂(38)`≥~≤'                                     
%                                     ∂(40)D                                         
%  
%  .SKIP 
%  .END "graph"

\caption{Information represented as lists.}
\end{figure}

\section{Atoms.}
\label{atoms}

	In this section we describe the simpler kinds of atoms
and their representation by sequences of characters.  It would
be confusing at this point to introduce all the different kinds
recognized by \clisp\ systems.  What we will say about atoms
in this section is correct, we think, but is somewhat delicately
worded to avoid both misstatement and premature complication.

	The expressions {\sx A}, {\sx B}, {\sx X}, {\sx Y}, {\sx 3},
{\sx +}, and {\sx ALL} occurring in the
lists in \figref{f1} and \figref{f2} are
all atoms.  More generally, an atom is written as a
sequence of capital letters, digits, and special characters with
certain exclusions.  The exclusions are <space>, <carriage return>,
and the other non-printing characters, and also the parentheses,
brackets, semi-colon, and comma.  Numbers are expressed as signed
decimal or octal numbers, the exact convention depending on the
implementation.  Floating point numbers are written with decimal
points and, when appropriate, an exponent notation depending on the
implementation.  The reader should consult the programmer's manual
for the \lisp\ implementation he intends to use.
Some further examples of atoms are shown in \figref{f2a}.
From such atoms we can form lists like
{\sx ((345\ 3.14159\ -47)\ A307B\ THE-LAST-TRUMP\ -45.21)}.

\begin{figure}[h]
\label{f2a}

$$\vbox{\smallskip\halign{\sx # \hfil\cr
  THE-LAST-TRUMP\cr
  A307B         \cr
  345           \cr
  -47           \cr
  -45.21        \cr
  3.14159       \cr
}}$$

\caption{Examples of atoms.}
\end{figure}

\section{List structures.}
\label{liststruc}

	Atoms and lists are represented in the memory of the computer by 
list structures.  A list structure is a collection of memory units called 
{\it  cons-cells}.   
A cons-cell generally consists of one or possibly two consecutive computer
words.  It is divided into two parts, the a-part and the d-part,
with each part being capable of containing an address in memory. 
The list structure representing a list contains a cons-cell for each
element of the list.  The a-part of the cell contains the address
of the list structure representing the element and the d-part contains
the address of the cell for the next element of the list.
These addresses are sometimes called pointers as they ``point'' to 
the component list structures.

A diagram shows this more clearly than words. \figref{f3} shows
the list structure corresponding to the list {\sx (+ (* X Y) X 3)} 
(which might represent the expression $xy + x + 3$).  

\begin{figure}
\label{f3}

$$ ?? {\sx (+ (* X Y) X 3)} ??$$ 
%  
%  	    ⊂αααπααα⊃     ⊂αααπααα⊃     ⊂αααπααα⊃     ⊂αααπααα⊃
%        ααααα→~   ~   εαααα→~   ~   εαααα→~   ~   εαααα→~   ~   εααααααα⊃
%  	    %απα∀ααα$     %απα∀ααα$     %απα∀ααα$     %απα∀ααα$       ~
%  	      ↓             ~             ~             ↓             ~
%  	     +           ~             ~             3             ~
%  			    ~  ⊂αααπααα⊃  ~  ⊂αααπααα⊃     ⊂αααπααα⊃  ~
%  			    %α→~   ~   εααβα→~   ~   εαααα→~   ~   ~  ~
%  			       %απα∀ααα$  ~  %απα∀ααα$     %απα∀απα$  ~
%  				 ↓        ~    ~             ↓   ~    ~
%  			       *      %απαα$             Y   %ααπα$
%  					    ↓                       ↓
%  					    X                      NIL

\caption{Representation of {\sx (+ (* X Y) X 3)} as a list structure.}
\end{figure}


	A program refers to a list by the address of the cell for its first
element.  According to this convention, we see that the a-part of this
cell is the first element of the list and the d-part is a pointer to a
sublist formed by deleting the first element.  Thus the first word of
the list structure of \figref{f3} contains a pointer to the list
structure representing the atom {\sx +},  while its d-part points to the
list {\sx ((* X Y) X 3)}.  The second word contains the list structure
representing {\sx (* X Y)} in its a-part and the list structure
representing {\sx (X 3)} in its d-part.  The last word points to the atom
{\sx 3} in its a-part and has a pointer to the atom \qNIL\ in its d-part.
This is consistent with the convention that \qNIL\ represents the null list.

	Atoms are represented by the addresses of their property
lists which are list structures of a special kind depending on the
implementation.  
(In some implementations, the first word of a
property list is in a special area of memory, in others the first word
is distinguished by sign, in still others it has a special a-part.
For basic \lisp\ programming, it is enough to know that atoms are
distinguishable from other list structures by a predicate called
\qat.) 
Each atom is represented uniquely, .e.g there is only one pointer
that prints as {\sx A}.  In the diagram, we don't give the structure
of property lists but just label them with their {\it print names}.

A partial dump of the memory of a computer containing the list
structure of \figref{f3} 
might look as shown in \figref{f4}.
Here X denotes the address of the property list of the atom
named {\sx X} and /3 denotes that of the atom named {\sx 3}.
Notice that the addresses of consecutive list elements need 
not be consecutive.
Also the last word of a list cannot have the address of a next
word in its d-part since there isn't any next word, so it has the
address of a special atom called \qNIL.

\begin{figure}[h]
\label{f4}

$$\vbox{%
\halign{\qquad \hfil # &\quad\hfil # &\quad \hfil # &
        \qquad \hfil # &\quad\hfil # &\quad \hfil # \cr
%
	address &  a-part &  d-part &      address &  a-part &  d-part \cr
\noalign{\vskip-\smallskipamount}
      \hrulefill&\hrulefill&\hrulefill& \hrulefill&\hrulefill&\hrulefill \cr
\noalign{\medskip}
	    86 &    * &     87 &         1000 &    + &    1001 \cr
	    87 &        X &     88 &         1001 &      86 &    1002 \cr
	    88 &        Y &    NIL &         1002 &       X &    2653 \cr
	       &          &        &         2653 &      /3 &     NIL \cr
}}$$

\caption{A memory map for the list structure {\sx (+ (* X Y) X 3)}.}
\end{figure}

\section{S-expressions.}
\label{sexp}

	When we examine the way list structures represent lists we
see a curious asymmetry.  Namely, the a-part of a list word can
contain an atom or a list, but the d-part can contain only a list or
the special atom \qNIL.  This restriction is quite unnatural from the
computing point of view, and we shall allow arbitrary atoms to
inhabit the d-parts of words, but then we must generalize the way
list structures are expressed as character strings. To do this, we
introduce the notion of S-expression.

	An S-expression is either an atom or a pair of S-expressions.
To write a non-atomic S-expression we write the pair of subexpression
separated by a dot ({\sx\ .\ }) and surrounded by parentheses.  
\figref{f5a} shows some examples of S-expressions.

\begin{figure}[h]
\label{f5a}

$$\vbox{\smallskip\halign{\sx # \hfil\cr
  A\cr
  (A . B)                  \cr
  (A . (B . A))            \cr
  (+ . (X . (Y . NIL))) \cr
  (3 . 3.4)                \cr
}}$$

\caption{Examples of S-expressions.}
\end{figure}



In writing an S-expression, 
the spaces around the ``{\sx \ .\ }'' may be omitted when this will not cause
confusion.  The only possible confusion is of the dot separator with
a decimal point in numbers.  Thus, in the above cases, we may write 
{\sx (A.B)}, {\sx (A.(B.A))}, and {\sx (+.(X.(Y.NIL)))}, but if we wrote 
{\sx (3.3.4)} confusion would result.

	In the memory of a computer, an S-expression is represented
by the address of a cons-cell whose a-part points to the first element of
the pair and whose d-part points to the second element of the pair.
Thus, the S-expressions {\sx (A.B)}, {\sx (A.(B.A))}, and {\sx (+.(X.(Y.NIL)))} 
are represented by the list structures of \figref{f5}.

\begin{figure}
\label{f5}

$$ ?? {\sx (A.B)},  {\sx (A.(B.A))}, and {\sx (+.(X.(Y.NIL)))} ?? $$

%  	    ⊂αααπααα⊃                             ⊂αααπααα⊃     ⊂αααπααα⊃
%         αααα→~   ~   ~                        αααα→~   ~   εαααα→~   ~   ~
%  	    %απα∀απα$                             %απα∀ααα$     %απα∀απα$
%  	      ↓   ↓                                 ~             ↓   ~
%  	      A   B                                 ~             B   ~
%  						    ~                 ~
%  						    ~                 ~
%  						    %ααααααααπαααααααα$
%  							     ↓
%  							     A
%  
%  	    ⊂αααπααα⊃     ⊂αααπααα⊃     ⊂αααπααα⊃
%         αααα→~   ~   εαααα→~   ~   εαααα→~   ~   ~
%  	    %απα∀ααα$     %απα∀ααα$     %απα∀απα$
%  	      ↓             ↓             ↓   ↓
%  	     +           X             Y  NIL
%  
%  
\caption{Representation of S-expressions as list structures.}
\end{figure}

	Note that the list {\sx (+ X Y)} and the S-expression
 {\sx (+ . (X . (Y . NIL)))} are represented in memory by the same list
structure.  The simplest way to treat this is to regard S-expressions
as primary and lists as special kinds of S-expressions,
namely those that never have any atom but \qNIL\ as the second part of
a pair.  The list notation can then be considered as an
abbreviated way of writing these S-expressions,
Thus, the list notation {\sx (A B $\ldots$ Z)} may be regarded
as an abbreviation of the S-expression notation
{\sx (A . (B . (C . ( $\ldots$ (Z . NIL) $\ldots$ ))))}.
Note that the a-part of a list can be any S-expression,
but the d-part of a list must be a list and the only atomic list is \qNIL.
In giving input to \lisp, either the list form or the
S-expression form may be used for lists.  On output, \lisp\ will print
a list structure as a list as far as it can, otherwise as an
S-expression.  Thus, some list structures will be printed in a mixed
notation, e.g. {\sx ((A . B) (C . D) (3))}.  Some \lisp s use an ``extended''
list notation for printing S-expressions which minimizes the number of
parentheses and dots printed.   In this notation
{\sx (A . (B . (C . D)))} would print as {\sx (A B C . D)}.  
(Programs for reading and printing S-expressions in the two notations
are given in \chapref{machin}.)

\subsubsection*{Exercises}

\begin{list}{\arabic{list}.}{\leftmargin0pt \labelwidth -\parindent}

\item  If we represent sums and products as indicated above and
use {\sx (- X)}, {\sx (/ X Y)}, and {\sx (EXPT X Y)} as representations
of $-x$, $x/y$, and $x↑y$ respectively, then (a) what do the following lists represent?
%
$$\hbox{\sx (/ 2 (EXPT (+ X (- Y)) 3))}$$
$$\hbox{\sx (+ -2 (- 2) (* X (EXPT Y 3.3)))}$$
%
and (b) How are the expressions $xyz+3(u+v)↑5-3$ 
and $(xy-yx)/(xy+yx)$ to be represented?

\item In the above mentioned graph notation, what graph is
represented by the following list?
$$\hbox{\sx ((A D E F) (B D E F) (C D E F) (D A B C) (E A B C) (F A B C))}$$

\item Write the list {\sx (+ (* X Y) X 3)} as an S-expression.
This is sometimes referred to as ``dot-notation''.

\item  Write the following S-expressions in list notation to 
whatever extent is possible:
%
$$\vbox{\smallskip\halign{\hfil #&\quad \sx # \hfil\cr
		a.& (A . NIL)\cr
		b.& (A . B)\cr
		c.& ((A . NIL) . B)\cr
		d.& ((A . B) . ((C . D) . NIL)).\cr
}}$$

 
\item	 How would you represent polynomials in one variable as lists?
Can you think of more than one useful way?  How would you represent polynomials
in several variables?  Can you tell if two lists represent the same 
polynomial?


%  	#. Consider a data base of directory accounting information for
%  a computer center.  For each directory the following information is
%  available:  the account number, the project name, the names of programmers
%  using this directory, the protection status (a binary bit string), 
%  the names of all files on this directory, and for each file the date it 
%  was last written and name of the programmer who wrote it.
%  Represent this information using list structures.  Show how the information 
%  could be retrieved.  For example could you produce a list of all files
%  written last by a given programmer using your representation of the data?

%\item[]	(The next two mathematical exercises are unconnected with subsequent
%material in this book.)

%\item Prove that the number of ``shapes'' of S-expressions
%with $n$ atoms is $$(2n - 2)!/(n!(n - 1)!).$$  The two shapes
%of S-expressions with three atoms are 
%$$\hbox{\sx (A . (B . C))}\ {\rm and}\ \hbox{\sx ((A . B) . C)}.$$
%These numbers are called Catalan numbers.

%\item  The above result can also be obtained by writing $S = A + S \TIMES S$
%as an ``equation'' satisfied by the set of S-expressions with the
%interpretation that an S-expression is either an atom or a pair of
%S-expressions.  The next step is to regard the equation as the
%quadratic $S↑2 - S + A = 0$, solve it by the quadratic formula
%choosing the minus sign for the square root.  Then the radical is
%regarded as the $1\over 2$ power and expanded by the binomial theorem.  
%Manipulating the binomial coefficients yields the above formula as the coefficient
%of $A↑n$ in the expansion.  Why does this somewhat ill-motivated
%and irregular procedure give the right answer?

\end{list}